Algorithm For Automatic Interpretation Of Noun Sequences

Author

  • Lucy Vanderwende
Abstract

This paper describes an algorithm for automatically interpreting noun sequences in unrestricted text. This system uses broad-coverage semantic information which has been acquired automatically by analyzing the definitions in an on-line dictionary. Previously, computational studies of noun sequences made use of hand-coded semantic information, and they applied the analysis rules sequentially. In contrast, the task of analyzing noun sequences in unrestricted text strongly favors an algorithm according to which the rules are applied in parallel and the best interpretation is determined by weights associated with rule applications.

1. INTRODUCTION

The interpretation of noun sequences (henceforth NSs, and also known as noun compounds or complex nominals) has long been a topic of research in natural language processing (NLP) (Finin, 1980; Sparck Jones, 1983; Leonard, 1984; Isabelle, 1984; Lehnert, 1988; and Riloff, 1989). The challenge in analyzing NSs derives from the semantic nature of the problem: their interpretation is, at best, only partially recoverable from a syntactic or a morphological analysis of NSs. To arrive at an interpretation of plum sauce which specifies that plum is the Ingredient of sauce, or of knowledge representation, specifying that knowledge is the Object of representation, requires semantic information for both the first noun (the modifier) and the second noun (the head).

In this paper, we are concerned with interpreting NSs which are composed of two nouns, in the absence of the context in which the NS appears; this scope is similar to most of the studies mentioned above. The algorithm for interpreting a sequence of two nouns is intended to be basic to the algorithm for interpreting sequences of more than two nouns: each pair of NSs will be interpreted in turn, and the best interpretation forms a constituent which can modify, or be modified by, another noun or NS (see also Finin, 1980).

There is no doubt that context, both intra- and inter-sentential, plays a role in determining the correct interpretation of a NS, since the most plausible interpretation in isolation might not be the most plausible in context. It is, however, a premise of the present system that, whatever the context is, the interpretation of a NS is always available in the list of possible interpretations. A NS that is already listed in an on-line dictionary needs no interpretation because the meaning can be derived from its definition.

In the studies of NSs mentioned above, the systems for interpreting NSs have relied on hand-coded semantic information, which is limited to a specific domain by the sheer effort involved in creating such a semantic knowledge base. The level of detail made possible by hand-coding has led to the development of two main algorithms for the automatic interpretation of NSs: concept dependent and sequential rule application.

The concept dependent algorithm (Finin, 1980) requires each lexical item to contain an index to the rule(s) which should be applied when that item is part of a NS; it has the advantage that only those rules are applied for which the conditions are met and each noun potentially suggests a unique interpretation. Whenever the result of the analysis is a set of possible interpretations, the most plausible one is determined on the basis of the weight which is associated with a role-fitting procedure. The disadvantage of this approach is that this level of lexical information cannot be acquired automatically, and so this approach cannot be used to process unrestricted text.

The algorithm for sequential rule application (Leonard, 1984) focuses on the process of determining which interpretation is the most plausible; the fixed set of rules is applied in a fixed order and the first rule for which the conditions are met results in the most plausible interpretation. This algorithm has the advantage that no weights are associated with the rules.
The disadvantage of this approach is that the degree to which the rules are satisfied cannot be expressed, and so, in some cases, the most plausible ...
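
The contrast between sequential rule application and the parallel, weighted application favored in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon, the feature names, the two rules, and the weights 0.3 and 0.9 are hypothetical and are not taken from the paper, whose system derives its semantic information automatically from dictionary definitions.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical, hand-written semantic features (illustration only).
LEXICON: Dict[str, Set[str]] = {
    "plum": {"food", "physical_object"},
    "sauce": {"food", "has_ingredients", "physical_object"},
}

RuleFn = Callable[[Set[str], Set[str]], float]


def part_rule(mod: Set[str], head: Set[str]) -> float:
    # Weak evidence (assumed weight): both nouns denote physical objects.
    if "physical_object" in mod and "physical_object" in head:
        return 0.3
    return 0.0


def ingredient_rule(mod: Set[str], head: Set[str]) -> float:
    # Strong evidence (assumed weight): the modifier is a food and the
    # head is something made from ingredients.
    if "food" in mod and "has_ingredients" in head:
        return 0.9
    return 0.0


# The order of this list matters only for the sequential strategy.
RULES: List[Tuple[str, RuleFn]] = [
    ("Part", part_rule),
    ("Ingredient", ingredient_rule),
]


def interpret_sequential(modifier: str, head: str) -> Optional[str]:
    """Return the first rule whose conditions are met, ignoring how well."""
    m, h = LEXICON[modifier], LEXICON[head]
    for name, rule in RULES:
        if rule(m, h) > 0:
            return name
    return None


def interpret_parallel(modifier: str, head: str) -> List[Tuple[str, float]]:
    """Apply every rule and rank the matching interpretations by weight."""
    m, h = LEXICON[modifier], LEXICON[head]
    scored = [(rule(m, h), name) for name, rule in RULES]
    ranked = sorted((pair for pair in scored if pair[0] > 0), reverse=True)
    return [(name, score) for score, name in ranked]


print(interpret_sequential("plum", "sauce"))
print(interpret_parallel("plum", "sauce"))
```

With this toy rule order, the sequential strategy stops at the weakly satisfied Part rule, while the parallel strategy ranks Ingredient first and still keeps Part in the list of possible interpretations.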


Similar resources

AN-EUL method for automatic interpretation of potential field data in unexploded ordnances (UXO) detection

We have applied an automatic interpretation method of potential data called AN-EUL in unexploded ordnance (UXO) prospective which is indeed a combination of the analytic signal and the Euler deconvolution approaches. The method can be applied for both magnetic and gravity data as well as for gradient surveys based upon the concept of the structural index (SI) of a potential anomaly which is relate...


Noun Compound Interpretation Using Paraphrasing Verbs: Feasibility Study

The paper addresses an important challenge for the automatic processing of English written text: understanding noun compounds’ semantics. Following Downing (1977) [1], we define noun compounds as sequences of nouns acting as a single noun, e.g., bee honey, apple cake, stem cell, etc. In our view, they are best characterised by the set of all possible paraphrasing verbs that can connect the targ...


A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation

The automatic interpretation of noun-noun compounds is an important subproblem within many natural language processing applications and is an area of increasing interest. The problem is difficult, with disagreement regarding the number and nature of the relations, low inter-annotator agreement, and limited annotated data. In this paper, we present a novel taxonomy of relations that integrates p...


Automatic Interpretation of UltraCam Imagery by Combination of Support Vector Machine and Knowledge-based Systems

With the development of digital sensors, an increasing number of high-resolution images are available. Interpretation of these images is not possible manually, which necessitates seeking for practical, fast and automatic solutions to solve the environmental and location-based management problems. The land cover classification using high-resolution imagery is a difficult process because of the c...


Paraphrasing Verbs for Noun Compound Interpretation

An important challenge for the automatic analysis of English written text is the abundance of noun compounds: sequences of nouns acting as a single noun. In our view, their semantics is best characterized by the set of all possible paraphrasing verbs, with associated weights, e.g., malaria mosquito is carry (23), spread (16), cause (12), transmit (9), etc. Using Amazon’s Mechanical Turk, we col...




Publication date: 1994